SPATIALLY-EXPLICIT MODELS FOR INFERENCE
ABOUT DENSITY IN UNMARKED POPULATIONS 

By Richard B. Chandler* and J. Andrew Royle* 
USGS Patuxent Wildlife Research Center* 

Recently-developed spatial capture-recapture (SCR) methods rep- 
resent a major advance over traditional capture-recapture methods be-
cause they yield explicit estimates of animal density instead of pop- 
ulation size within an unknown area, and they account for hetero- 
geneity in capture probability arising from the juxtaposition of indi- 
viduals and sample locations. Although the utility of SCR methods 
is gaining recognition, the requirement that all individuals can be 
uniquely identified excludes their use in many contexts. In this pa- 
per, we develop models for situations in which individual recognition 
is not possible, thereby allowing SCR methods to be applied in stud- 
ies of unmarked or partially-marked populations. The data required 
for our model are spatially-referenced counts made on one or more 
sample occasions at a collection of closely-spaced sample units such 
that individuals can be encountered at multiple locations. Our ap- 
proach utilizes the spatial correlation in counts as information about 
the location of individual activity centers, which enables estimation 
of density and distance-related heterogeneity in detection. Camera- 
traps, hair snares, track plates, sound recordings, and even point 
counts can yield spatially-correlated count data, and thus our model 
is widely applicable. A simulation study demonstrated that while the 
posterior distribution of abundance or density is strongly skewed in 
small samples, the posterior mode is an accurate point estimator as 
long as the trap spacing is not too large relative to the scale parameter
($\sigma$) of the detection function. Marking a subset of the population can
lead to substantial reductions in posterior skew and increased pos- 
terior precision. We also fit the model to point count data collected 
on the northern parula (Parula americana), and obtained a density
estimate (posterior mode) of 0.38 (95% CI: 0.19, 1.64) birds/ha. Our 
paper challenges sampling and analytical conventions by demonstrat- 
ing that neither spatial independence nor individual recognition is 
needed to estimate population density — rather, spatial dependence 
induced by design can be informative about individual distribution 
and density.
counts, spatial capture-recapture, spatial point process, population density, R package
unmarked









1. Introduction. Estimates of population density are required in basic 
and applied ecological research, but are difficult to obtain for many species, 
including some of the most critically endangered. A primary obstacle faced 
when estimating population density is that the number of individuals cap- 
tured or detected is an unknown fraction of the actual number present, $N$.
Capture-recapture (CR) methods yield estimates of $N$; however, the effec-
tive area sampled, $A_e$, is typically unknown, and thus density cannot be
estimated (Dice, 1938; Wilson and Anderson, 1985). This is a well-known 
deficiency of traditional CR methods that limits their utility for making in- 
ferences beyond the indefinite region in which the sampling was conducted. 

The limitations of traditional CR methods extend beyond their inability 
to estimate density. For example, even if $A_e$ were known, CR estimators can
be biased by heterogeneity in capture probability resulting from unmod- 
eled spatial variables. In particular, it is intuitive that individuals close to a
trap are more likely to be captured than individuals further away. So-called 
spatial capture-recapture (SCR) models (Efford, 2004; Borchers and Efford, 
2008; Royle and Young, 2008; Royle et al., 2009; Gardner, Royle and Wegan, 
2009; Borchers, 2010) address these problems and produce direct estimates 
of density or population size for explicit spatial regions. This is accomplished 
by modeling the number and locations of individual activity centers as well 
as distance-related heterogeneity in capture probability. Information about 
activity centers comes from the spatial coordinates of the traps where indi- 
viduals were captured — data which have always been available but were 
rarely utilized until recently. 

Because SCR models overcome the limitations of CR methods without 
requiring additional data, they represent a major advance in efforts to es- 
timate population density, and their use is becoming widespread. However, 
use of such methods requires that all individuals are uniquely identifiable, 
which can be difficult to achieve in practice. In some cases, such as in point 
counts of birds, it is typically not even possible to identify individuals. In 
other cases, even when resources are available to obtain individual recogni- 
tion, the identity of many individuals often remains unknown. For example, 
in "camera trapping" studies (O'Connell, Nichols and Karanth, 2010), the 
resulting photographs are not always sufficient for identification due to sim- 
ilar markings among animals. In many other cases, no natural markings are 
present to aid recognition (e.g., fishers, mountain lions, deer), and capturing
all individuals encountered may be too difficult or intrusive. 

In this paper, we present a model allowing for inference about density and 
population size when individuals cannot be uniquely identified. Our model 
exploits the spatial structure in observed counts induced by the spatial or-
ganization of count locations that are in close proximity to one another. The
key to our approach is that, rather than viewing the spatial correlation as an in-
ferential obstacle, we utilize it as direct information about
spatial distribution and population size. We formulate the model in terms
of a collection of latent trap- and individual-specific encounter frequencies 
and provide a Bayesian analysis of the model based on Markov chain Monte 
Carlo (MCMC). We demonstrate efficacy of the approach using a simula- 
tion study. The model is also applied to a bird survey data set collected on 
a 50-m grid of 105 point count locations. 

Our paper challenges two ingrained and preeminent paradigms in statis-
tical ecology: first, that sample units should be structured so as to ensure
independence of the observable random variable^1, and second, that individ-
ual information is needed to obtain estimates of population size and density^2.
Our proposed class of models directly refutes both of these paradigms and 
suggests whole new classes of sampling designs and statistical models for 
making inferences about animal demographic parameters. 

2. Sampling Design. We consider a sampling design in which R traps 
are operated on T occasions within a region S, which for simplicity we treat 
as two-dimensional, $S \subset \mathbb{R}^2$. Although we use the term "trap", anything
capable of recording counts of unmarked individuals could be used, such 
as a camera or a human observer conducting a point count, i.e., a survey 
from a fixed point in space. The sample occasion can be an arbitrary time 
period, such as a single day in a camera trap study, or a 10-min survey 
interval. Trap locations are characterized by the coordinates of each trap, 
$x_r = (x_{r1}, x_{r2})$; $r = 1, 2, \ldots, R$. The data resulting from this design are the
$R \times T$ matrix of counts, $n_{rt}$; $t = 1, 2, \ldots, T$.

Unlike similar count-based sampling protocols, this design aims to induce 
correlation in the neighboring counts by organizing the trap locations suf- 
ficiently close together so that individual animals might be encountered at 
multiple locations. Thus, we do not make the customary assumptions that 
counts can be viewed as i.i.d. outcomes and that no movement occurs be- 
tween sampling occasions. In the following section we develop a hierarchical 
model that describes the process by which such correlation is generated. 

3. The Hierarchical Model and Data. To devise an inference frame- 
work for the observed counts, we specify a hierarchical model consisting of 
component models for the underlying ecological state process, which in the
present case are the activity centers of N individuals, and then a separate
model component describing the process of encountering individuals at each
trap.

^1 Hurlbert (1984) has been cited >5000 times according to Google Scholar.
^2 Even classical distance sampling relies on sampling of unique individuals where identity
is equated to a point in space.

3.1. State model. Suppose that N individuals occur within S and each 
individual has an associated activity center around which movements occur. 
For many species, this could be the center of a home range or a territory. 
We adopt a pragmatic definition of home range to be the space about which 
an individual moves during a specific time period (in our case, the inter- 
val from t = 1 to t = T). Denote the coordinates of this activity center 
$s_i = (s_{i1}, s_{i2})$; $i = 1, 2, \ldots, N$. Since the activity centers cannot be directly
observed, they are regarded as latent variables, which we model as outcomes 
of a spatial point process. In principle, general point process models could 
be considered, but here we model activity centers as 

$$s_i \sim \mathrm{Uniform}(S)$$

which is equivalent to assuming

$$s_{i1} \sim \mathrm{Uniform}(0, B)$$

$$s_{i2} \sim \mathrm{Uniform}(0, B)$$

if the state-space $S$ is a $B \times B$ square. This homogeneous point process model
is customary in most existing applications of SCR models (Royle et al., 2009; 
Gardner, Royle and Wegan, 2009). We discuss a square S for simplicity only;
any polygon containing the trap locations $x_r$ could be used to define the state-space. In
practice, it should be chosen large enough so that an individual's encounter 
rate is negligible if its activity center occurs on the edge of the polygon. 
This will depend on the specific observation model chosen (see below), and 
sometimes biological considerations (e.g., suitable habitat) may be used to 
determine S (Royle et al., 2009). 
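
As a minimal sketch of the state model, assuming a square state-space and a hypothetical helper name (`simulate_activity_centers` is ours, not the paper's), activity centers can be drawn as independent uniform points:

```python
import random


def simulate_activity_centers(n_individuals, side, rng):
    """Draw activity centers s_i ~ Uniform(S) on a side x side square."""
    return [(rng.uniform(0.0, side), rng.uniform(0.0, side))
            for _ in range(n_individuals)]


# Illustrative values only: 45 individuals on a 20 x 20 unit state-space.
rng = random.Random(1)
centers = simulate_activity_centers(n_individuals=45, side=20.0, rng=rng)
```

For a non-square polygon, one simple option would be to draw points in a bounding box and keep only those falling inside the polygon.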

3.2. Observation model. It is natural to regard the encounter rate of an 
individual as a function of the Euclidean distance between the individual's 
activity center and the trap location, $d_{ir} = \lVert x_r - s_i \rVert$. To be precise about
this, we let $z_{irt}$ be the encounter frequency of individual $i$ in trap $r$ during
occasion $t$. While we will adopt the view that the variables $z_{irt}$ are latent
variables (see below), it will be convenient to formulate the model in terms 
of these variables. 

Therefore, we assume that the expected encounter frequency of an indi-
vidual in some trap is related to $d_{ir}$ as follows:

$$E[z_{irt}] = \lambda_{ir} = \lambda_0 k_{ir}$$

where $\lambda_0$ is the expected encounter rate at $d = 0$ and $k_{ir}$ is some positive-
valued function of distance $d_{ir}$. We assume

$$k_{ir} = \exp(-d_{ir}^2 / (2\sigma^2))$$

where $\sigma$ is a scale parameter related to home range size. The parameter $\sigma$ also determines
the degree of correlation among counts since animals with large home ranges
are more likely to be detected at multiple traps relative to animals with small
home ranges. The phenomenon is analogous to correlation induced by aver-
aging spatial noise, in which case there is a unique correlation between the
smoothing kernel and the induced covariance function (Higdon, 2002).
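
To make the role of the scale parameter concrete, the following sketch evaluates the half-normal encounter-rate function for a single individual and trap. The function name `encounter_rate` and the numerical values are illustrative assumptions, not part of the paper.

```python
import math


def encounter_rate(center, trap, lam0, sigma):
    """Expected encounter frequency lam0 * exp(-d^2 / (2 * sigma^2))."""
    d2 = (center[0] - trap[0]) ** 2 + (center[1] - trap[1]) ** 2
    return lam0 * math.exp(-d2 / (2.0 * sigma ** 2))


# An individual centered on the trap is encountered at the baseline rate;
# one unit away, the rate depends strongly on sigma.
rate_at_trap = encounter_rate((5.0, 5.0), (5.0, 5.0), lam0=0.5, sigma=0.5)
rate_one_unit = encounter_rate((5.0, 5.0), (6.0, 5.0), lam0=0.5, sigma=0.5)
rate_one_unit_wide = encounter_rate((5.0, 5.0), (6.0, 5.0), lam0=0.5, sigma=1.0)
```

With a larger scale parameter the same individual contributes non-negligible expected counts to more distant traps, which is the source of the correlation among neighboring counts.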

We emphasize that our focus is on situations in which individuals are 
not uniquely identifiable, and therefore the encounter frequencies for each 
individual cannot be observed, and so they are latent variables. We assume 
that these latent variables are realizations from a Poisson distribution with
mean $\lambda_{ir}$:

$$z_{irt} \sim \mathrm{Poisson}(\lambda_{ir}). \tag{3.1}$$

In traditional SCR models, $z_{irt}$ are the observed data, i.e., the frequency of
encounters of individual $i$ at trap $r$ on replicate survey $t$. However, when
individual identity is not known, the observed data are the sample- and
trap-specific totals, aggregated over all individuals:

$$n_{rt} = \sum_{i=1}^{N} z_{irt}.$$

Thus the data required by our model are a reduced-information summary 
of the latent encounter histories. 
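
The data-generating process described so far can be sketched as follows. This is an illustrative simulation under assumed parameter values; `rpois` and `simulate_counts` are hypothetical helper names, and the Poisson sampler is Knuth's multiplication method, chosen only to keep the example within the standard library.

```python
import math
import random


def rpois(mean, rng):
    """Poisson draw by Knuth's multiplication method (adequate for small means)."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_counts(centers, traps, lam0, sigma, n_occasions, rng):
    """Return (z, n): latent z[i][r][t] and observed totals n[r][t]."""
    z = []
    for s in centers:
        z_i = []
        for x in traps:
            d2 = (s[0] - x[0]) ** 2 + (s[1] - x[1]) ** 2
            lam = lam0 * math.exp(-d2 / (2.0 * sigma ** 2))
            z_i.append([rpois(lam, rng) for _ in range(n_occasions)])
        z.append(z_i)
    # Identities are discarded: only the per-trap, per-occasion totals remain.
    n = [[sum(z[i][r][t] for i in range(len(centers)))
          for t in range(n_occasions)] for r in range(len(traps))]
    return z, n


rng = random.Random(42)
traps = [(float(a), float(b)) for a in range(3, 8) for b in range(3, 8)]
centers = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(12)]
z, n = simulate_counts(centers, traps, lam0=0.5, sigma=0.75, n_occasions=5, rng=rng)
```

In an unmarked study only `n` would be available to the analyst; `z` is returned here solely so the aggregation step can be inspected.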

Under the Poisson encounter model we have that 

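$$n_{rt} \sim \mathrm{Poisson}(\Lambda_r) \tag{3.2}$$

where

$$\Lambda_r = \lambda_0 \sum_{i} k_{ir}.$$

Further, because $\Lambda_r$ does not depend on $t$, we can aggregate the replicated
counts, defining $n_{r.} = \sum_{t} n_{rt}$ and then

$$n_{r.} \sim \mathrm{Poisson}(T \Lambda_r).$$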

As such, $T$ and $\lambda_0$ play equivalent roles in determining the baseline encounter
rate. This formulation of the model in terms of the aggregate count simpli-
fies computations as the latent variables $z_{irt}$ do not need to be updated in
the MCMC estimation scheme (see below). However, retaining $z_{irt}$ in the for-
mulation of the model is important if some individuals are uniquely marked,
in which case modifying the MCMC algorithm (see below) to include both
types of data is trivial. This is because uniquely identifiable individuals pro-
duce observations of some of the $z_{irt}$ variables.
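
The step from Eq. 3.1 to Eq. 3.2 rests on the standard result that a sum of independent Poisson variables is again Poisson. Written out conditional on the activity centers, and assuming independence across individuals and occasions, the argument is:

```latex
\begin{align*}
n_{rt} = \sum_{i=1}^{N} z_{irt}, \quad
z_{irt} \overset{\text{ind}}{\sim} \mathrm{Poisson}(\lambda_{ir})
  &\;\Longrightarrow\;
  n_{rt} \sim \mathrm{Poisson}\Big(\sum_{i=1}^{N} \lambda_{ir}\Big)
  = \mathrm{Poisson}(\Lambda_r), \\
n_{r.} = \sum_{t=1}^{T} n_{rt}
  &\;\Longrightarrow\;
  n_{r.} \sim \mathrm{Poisson}(T \Lambda_r),
  \qquad
  \Lambda_r = \lambda_0 \sum_{i=1}^{N} \exp\!\big(-d_{ir}^2 / (2\sigma^2)\big).
\end{align*}
```

The counts at different traps are independent only conditional on the activity centers; marginally they are correlated because neighboring traps share the same nearby individuals.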

We imagine that other observation models might be possible (see Discus- 
sion) although we focus here on the Poisson encounter model because it has 
considerable relevance to animal surveys, and has additional methodological 
context related to point process models which we address in the Discussion. 

Although the model is naturally described conditional on $N$ (i.e., in terms
of $N$ latent encounter frequencies $z_{irt}$), in all practical applications $N$ is
unknown and, in fact, is the object of inference. We accommodate the fact that
$N$ is unknown using a Bayesian estimation scheme described in the following
section.

4. Estimation by MCMC. We adopt a Bayesian framework for infer-
ence allowing estimation of $N$ while retaining the formulation of the model
that is conditional on the latent activity centers $s_i$. Specifically, we employ
Markov chain Monte Carlo (MCMC) to simulate posterior distributions of
the parameters. However, the fact that $N$ is unknown presents a technical
challenge because the size of the parameter space can change with each MCMC
iteration. To resolve this, we adopt the formulation of data augmentation
in Royle, Dorazio and Link (2007) who used a specific prior construction
for $N$ in terms of individual level Bernoulli trials. In particular, we assume
$N \sim \mathrm{DUnif}(0, M)$ for some large integer $M$. We construct this prior by as-
suming $N \mid M, \psi \sim \mathrm{Bin}(M, \psi)$ and $\psi \sim \mathrm{Unif}(0, 1)$ which implies, marginally,
that $N$ has the requisite $\mathrm{DUnif}(0, M)$ distribution. However, the hierarchical
formulation of the prior suggests an implementation in which we introduce
a set of latent indicator variables $w_i \sim \mathrm{Bern}(\psi)$, $i = 1, \ldots, M$, and, furthermore, the model
implies that $z_{irt}$ are obtained from the specified distribution (Eq. 3.1) if
$w_i = 1$, or if $w_i = 0$, $z_{irt} = 0$ with probability 1. In effect, extending the
model in this way induces a reparameterization for the latent counts that is
a zero-inflated version of the original conditional-on-$N$ model. Specifically,
the model under data augmentation becomes

$$z_{irt} \mid w_i \sim \mathrm{Poisson}(\lambda_{ir} w_i)$$
$$w_i \sim \mathrm{Bern}(\psi)$$

Under this formulation $N = \sum_{i=1}^{M} w_i$, and population density is simply
$D = N / A(S)$ where $A(S)$ is the area of the point process state-space $S$.
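
The claim that the binomial prior with a uniform mixing parameter yields a discrete uniform prior on the population size can be checked directly. The short derivation below uses $\psi$ for the inclusion probability and the standard beta integral:

```latex
\begin{align*}
\Pr(N = k \mid M)
  &= \int_0^1 \binom{M}{k} \psi^{k} (1 - \psi)^{M - k} \, d\psi \\
  &= \binom{M}{k} \, \frac{k! \, (M - k)!}{(M + 1)!} \\
  &= \frac{1}{M + 1}, \qquad k = 0, 1, \ldots, M .
\end{align*}
```

Every value of $N$ from 0 to $M$ therefore receives equal prior mass, so $M$ only needs to be large enough that the posterior of $N$ is not truncated at $M$.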






We developed two distinct MCMC implementations for this model (Supplement A). 
In the first, we devised an algorithm for the model conditional on the la- 
tent variables $z_{irt}$. This formulation is useful for problems in which one or
more individual identities are available, in which case the $z_{irt}$ are observ-
able for those individuals. The unobserved $z_{irt}$ are easily updated using
their full-conditional distribution which is multinomial with sample size $n_{rt}$.
The remaining parameters are updated using Metropolis-Hastings steps (see
Supplement A). In the second formulation of the algorithm we applied the
Metropolis-Hastings algorithm to the model unconditional on the $z_{irt}$ vari-
ables. In that case, the marginal distribution for $n_{rt}$ is precisely Eq. 3.2. This
algorithm is slightly more convenient because it avoids having to update the
$z_{irt}$ variables of which there are many.
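
The multinomial update of the latent encounter frequencies can be illustrated for a single trap and occasion. This is a sketch of the general idea, not the authors' implementation from Supplement A; it assumes the cell probabilities are proportional to each augmented individual's expected encounter rate times its inclusion indicator, and `update_latent_counts` is a hypothetical name.

```python
import random


def update_latent_counts(n_rt, rates, w, rng):
    """Sample (z_1, ..., z_M) | n_rt ~ Multinomial(n_rt, pi),
    with pi_i proportional to rates[i] * w[i]."""
    weights = [lam * wi for lam, wi in zip(rates, w)]
    z = [0] * len(rates)
    if n_rt == 0:
        return z
    if sum(weights) <= 0.0:
        raise ValueError("positive count but no individual can be encountered")
    # Assign each of the n_rt detections to one individual.
    for i in rng.choices(range(len(rates)), weights=weights, k=n_rt):
        z[i] += 1
    return z


rng = random.Random(7)
rates = [0.40, 0.05, 0.30, 0.10]  # encounter rates of four augmented individuals
w = [1, 1, 0, 1]                   # the third individual is not in the population
z = update_latent_counts(n_rt=6, rates=rates, w=w, rng=rng)
```

Individuals with an indicator of zero receive zero weight and are never assigned a detection, which mirrors the zero-inflated structure of the augmented model.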

5. Applications. 

5.1. Simulation studies. We carried out two simulation studies to evalu- 
ate the basic efficacy of the estimator. In the first study, all individuals were 
unmarked and we assessed posterior properties under varying degrees of cor- 
relation in the counts. In the second study, we measured the improvements 
in posterior precision obtained by marking a subset of the population. 

To investigate the effects of correlation, we used a $15 \times 15$ trap grid and
simulated scenarios with $\sigma \in \{0.5, 0.75, 1.0\}$. We selected these values be-
cause clearly $\sigma$ should not be too small relative to the grid spacing or the
counts are independent, i.e. the trap totals are then just i.i.d. Poisson ran-
dom variables. Similarly, $\sigma$ should not be too large relative to trap spacing
or else again the counts become i.i.d. Poisson random variables. We note
that trap spacing is widely recognized as being relevant in the application
of spatial capture-recapture models, where models require observations of
individuals at multiple traps, although to this point in time little formal
analysis of the design problem has been done. For the other parameters in
the model we considered $T\lambda_0 \in \{2.5, 5.0\}$ and $N \in \{27, 45, 75\}$ individuals
distributed on a $20 \times 20$ unit state-space centered over the $15 \times 15$ array
of trap locations. This configuration implies a buffer of 3 units around the
traps, which was sufficiently large to ensure that encounter rate was negligi-
ble for the values of $\sigma$ considered. We fit the model to 100 datasets for each
of the 18 scenarios.
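
The layout of this design can be reproduced in a few lines. The sketch assumes unit trap spacing, which is what the stated 3-unit buffer implies for a 15 x 15 grid centered in a 20 x 20 square; the constant and function names are illustrative.

```python
from itertools import product

SIGMAS = (0.5, 0.75, 1.0)
EFFECTIVE_RATES = (2.5, 5.0)  # T * lambda_0
POP_SIZES = (27, 45, 75)
SIDE = 20.0                   # state-space is SIDE x SIDE
GRID = 15                     # traps per row and column (unit spacing assumed)


def trap_grid(n_per_side, side):
    """Unit-spaced n x n trap grid centered in a side x side square."""
    offset = (side - (n_per_side - 1)) / 2.0
    return [(offset + a, offset + b)
            for a in range(n_per_side) for b in range(n_per_side)]


traps = trap_grid(GRID, SIDE)
# Smallest distance from any trap to the edge of the state-space.
buffer_width = min(min(x, y, SIDE - x, SIDE - y) for x, y in traps)
# Full factorial design over sigma, effective encounter rate, and N.
scenarios = list(product(SIGMAS, EFFECTIVE_RATES, POP_SIZES))
```

Crossing the three factors gives the 18 scenarios referred to in the text.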

Results of our first simulation study indicate that for the smallest value of
$\sigma$, the posterior mode, if regarded as a point estimator of $N$, is approxi-
mately frequentist unbiased (Table 1). However, the posterior distributions
are skewed, which results in posterior means exhibiting frequentist bias
on the order of 5-10%. Substantial reductions in root-mean-squared error
(RMSE) are realized as effective encounter rate doubles from 2.5 to 5.0
(T=5 to T=10). Coverage is slightly less than nominal for this case. Per-
formance of the estimator deteriorates as $\sigma$ increases. For $\sigma = 0.75$ the
posterior distributions are centered approximately over the data-generating
value (having nearly frequentist unbiased modes), but the coverage is quite
a bit less than nominal as the posterior becomes more strongly skewed. The
general pattern holds for the highest level of $\sigma = 1.0$.

Table 1

Simulation results showing the bias and precision of the posterior mean and mode for the
population size parameter, $N$. Proportion of 95% credible intervals covering the data
generating value is also reported. $\lambda_0 = 0.5$ for all cases.

[Table body incomplete in this copy. Surviving entries, in source order: 22.174, 0.880; 14.089, 0.870; 87.802, 33.919, 71.835, 31.601, 0.940; 10, 83.184, 24.901, 72.205, 22.336.]

To assess the influence of marking a subset of individuals, we used the 
same number and configuration of traps as described above, and we set 
$\sigma = 0.5$, $\lambda_0 = 0.5$, $N = 75$, and $T = 5$. Then, we generated 100 datasets for
$m \in \{5, 15, 25, 35\}$ where $m$ is the number of marked individuals randomly
sampled from the population. 

Posterior distributions of $N$ for different numbers of marked individuals
are shown in Fig. 1. As anticipated, posterior precision increases substan-
tially with the proportion of marked individuals. The posterior mode was
approximately unbiased as a point estimator, and the root-mean squared
error decreased from 19.076 when all 75 individuals were unmarked to 6.398
when 35 individuals were marked (Tables 1 and 2). Coverage was nominal
for all values of $m$ and posterior skew was greatly diminished (Table 2).

Fig 1. Overlaid posterior distributions of N from 100 simulations for four levels of marked
individuals.

5.2. Point count data. To apply our model to data collected in the field, 
we designed a point count study of the northern parula (Parula ameri-
cana), a Neotropical-Nearctic migratory passerine. This species defends well-
defined territories during the breeding season (Moldenhauer and Regelski, 
1996), and thus our modeling effort was focused on estimating the num- 
ber and location of territory centers. Points were located on a 50-m grid to 
ensure spatial correlation. This small grid spacing contrasts with the conven- 
tional practice of spacing points by > 200 m to obtain i.i.d. counts. Figure 2 
depicts the spatially-correlated counts ($n_{r.}$) from the 105 point count loca-
tions surveyed three times each during June 2006 at the Patuxent Wildlife
Research Center in Laurel, Maryland, USA. A total of 226 detections were
made with a maximum count of 4 during a single survey. At 38 points, no 
warblers were detected. All but one of the detections were of singing males, 
and this one observation was not included in the analysis. 

In our analysis of the parula data, we defined the point process state-space
by buffering the grid of point count locations by 250 m and used M = 300.
We simulated posterior distributions using three Markov chains, each con-
sisting of 300000 iterations after discarding the initial 10000 draws. Conver-
gence was satisfactory, as indicated by an $\hat{R}$ statistic of < 1.02 (Gelman and Rubin,
1992).

Fig 2. Spatially-correlated counts of northern parula on a 50-m grid. The size of the circle
represents the total number of detections at each point.

One benefit of a Bayesian analysis is that it can accommodate prior in-
formation on the home range size and encounter rate parameters, which is
readily available for many species. To illustrate, we analyzed the parula data
using two sets of priors. In the first set, all priors were improper, customary
noninformative priors (see Table 3). Uniform priors were also used in the
second set, with the exception of an informative prior for the scale parameter
$\sigma \sim \mathrm{Gamma}(13, 10)$. We arrived at this prior using the methods described
by Royle, Kery and Guelat (2011) and published information on the war-
bler's home range size and detection probability (Moldenhauer and Regelski,
1996; Simons et al., 2009). More details on this derivation are found in
Supplement A. We briefly note here that this prior includes the biologically-
plausible range of values of $\sigma$ suggested by the published literature.

Table 2

Posterior mean, mode, and associated RMSE for simulations in which m of N = 75
individuals were marked. One hundred simulations of each case were conducted.

        Mean     RMSE (mean)   Mode     RMSE (mode)   Coverage
m=5     80.096   13.948        76.270   13.635        0.980
m=15    78.763   11.548        76.110   10.964        0.940
m=25    77.658    8.826        75.810    8.562        0.950
m=35    76.385    6.453        74.900    6.398        1.000

Table 3

Posterior summary statistics for spatial Poisson-count model applied to the northern
parula data. Two sets of priors were considered. M = 300 was used in both cases.
Parulas/ha, D, is a derived parameter.

Par           Prior        Mean     SD       Mode     q0.025   q0.50    q0.975
$\sigma$      U(0, ∞)      2.154    1.222    1.230    0.896    1.665    5.170
$\lambda_0$   U(0, ∞)      0.284    0.149    0.212    0.084    0.256    0.665
N             U(0, M)      40.953   38.072   4.000    3.000    31.000   143.000
D                          0.427    0.397    0.0417   0.0313   0.323    1.490

$\sigma$      G(13, 10)    1.301    0.258    1.230    0.889    1.266    1.908
$\lambda_0$   U(0, ∞)      0.298    0.132    0.240    0.098    0.279    0.603
N             U(0, M)      59.321   36.489   36.000   18.000   50.000   157.000
D                          0.618    0.380    0.375    0.188    0.521    1.635

The posterior distribution for N was highly skewed with a long right tail 
resulting in a wide 95% credible interval (Table 3). Nonetheless, the interval 
for density, D, includes estimates reported from more intensive field studies 
(Moldenhauer and Regelski, 1996). This was true when considering both sets 
of priors, although posterior precision was higher under the informative set of 
priors. Specifically, the use of prior information reduced posterior density at 
high, biologically implausible, values of $\sigma$, and hence decreased the posterior
mass for low values of N (Fig. 3). 

In addition to estimating density, our model can be used to produce den- 
sity surface maps, which are often used in applied ecological research to 
direct management efforts and develop hypotheses regarding the factors in- 
fluencing abundance. Density surface maps can be produced by discretizing
the state-space and tallying the number of activity centers occurring in each 
pixel during each MCMC iteration. Parula density was highest near the 
northeastern corner of the study plot, which may correspond to important 
habitat features such as suitable nest site locations (Fig. 4). We anticipate 
future model extensions to directly model the point process intensity using 
habitat covariates. 
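
The tallying step described above can be sketched as follows, assuming a square state-space with its lower-left corner at the origin and posterior draws stored as lists of activity-center coordinates for the individuals with $w_i = 1$. The function name `density_surface` and the toy draws are illustrative assumptions.

```python
def density_surface(posterior_centers, side, n_pixels):
    """Posterior mean number of activity centers per pixel.

    posterior_centers: one list per MCMC iteration holding the (x, y)
    coordinates of individuals included in the population at that iteration.
    """
    width = side / n_pixels
    surface = [[0.0] * n_pixels for _ in range(n_pixels)]
    for centers in posterior_centers:
        for x, y in centers:
            col = min(int(x / width), n_pixels - 1)  # clamp points on the upper edge
            row = min(int(y / width), n_pixels - 1)
            surface[row][col] += 1.0
    n_iter = len(posterior_centers)
    return [[cell / n_iter for cell in row] for row in surface]


# Two toy "iterations" on a 4 x 4 state-space split into 2 x 2 pixels.
draws = [
    [(0.5, 0.5), (3.2, 3.9)],
    [(0.7, 0.1), (3.9, 3.1), (1.5, 2.5)],
]
surface = density_surface(draws, side=4.0, n_pixels=2)
```

Dividing each cell by the pixel area converts the tally to a density, and the cells sum to the posterior mean of the population size.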

6. Discussion. In this paper, we confronted one of the most difficult 
challenges faced in wildlife sampling — estimation of density in the absence 
of data to distinguish among individuals. To do so, we developed a novel 
class of spatially-explicit models that applies to spatially organized counts, 
where the count locations or devices are located sufficiently close together 
so that individuals are exposed to encounter at multiple devices. This design 
yields correlation in the observed counts, and this correlation proves to be 
informative about encounter probability parameters and hence density. We
note that sample locations in count-based studies are typically not organized
close together in space because conventional wisdom and standard practice
dictate that independence of sample units is necessary (Hurlbert, 1984). Our
model suggests that in some cases it might be advantageous to deviate from
the conventional wisdom if one is interested in direct inference about density.
Of course, this is also known in the application of standard spatial capture-
recapture models (Borchers and Efford, 2008) where individual identity is
preserved across trap encounters, but it is seldom, if ever, considered in the
design of more traditional count surveys.

Fig 3. Effects of the $\sigma \sim \mathrm{Gamma}(13, 10)$ prior on the posterior distributions from the northern
parula model. Posteriors from model with uniform priors are shown in black, and posteriors
from the informative prior model are shown in gray. The prior itself is shown as a dotted
line in the upper panel.

Fig 4. Estimated density surface of northern parula activity centers. The grid of point
count locations with count totals is superimposed. See Fig. 2 for additional details.

Our model is relevant to a wide range of animal sam-
pling problems. Our motivating problem involved bird point counts where 
individual identity is typically not available. The model also applies to other 
standard methods used to sample unmarked populations, such as camera 
traps or even methods that yield sign (e.g., scat, track) counts indexed by
space. However, results of our simulation study reveal some important lim- 
itations of the basic estimator applied to situations in which none of the 
individuals can be uniquely identified. In particular, posterior distributions 
are highly skewed in typical small to moderate sample size situations and 
posterior precision is low. 

Several modifications of the model can lead to improved performance of 
the estimator. Our simulation results demonstrate that marking a subset of 
individuals can yield substantial increases in posterior precision. Marking 
a subset of individuals is commonplace in animal studies such as when a
small number of individuals are radio-collared in conjunction with a count- 
based survey (Bartmann et al., 1987). In many other situations a subset of 
individuals can be identified by natural marks alone, and thus our model 
could be applied to data from camera-trapping studies of species such as 
mountain lions, deer, and coyotes for which traditional SCR methods are not
effective (Kelly et al., 2008). Thus, the ability to study partially-marked
populations adds flexibility to existing SCR methods, and also creates new 
opportunities for designing efficient SCR studies since the costs of marking 
all individuals in a population can be prohibitive. 

We note the existence of traditional approaches to combining data on
marked and unmarked animals based on either the Lincoln-Petersen estima-
tor or so-called "mark-resight" methods (Bartmann et al., 1987; Minta and Mangel,
1989; McClintock and Hoeting, 2009). In their simplest form, mark-resight
methods involve fitting standard closed-population mark-recapture models
to the data on marked individuals, and the resultant estimate of detection
probability ($p$) is used to estimate population size as $N = m + u/p$ where
$m$ and $u$ are the numbers of marked and unmarked individuals, respectively.
In this case, the unmarked individuals provide no information about the
encounter rate parameters, and thus mark-resight methods cannot be used 
unless a large sample of marked individuals is available. This contrasts with 
our approach which can be used even when all individuals are unmarked. 
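
For comparison, the simplest mark-resight calculation described above amounts to a one-line formula. The sketch below uses made-up numbers and reads $u$ as the number of unmarked individuals counted; the function name is hypothetical.

```python
def mark_resight_estimate(n_marked, n_unmarked_seen, p_hat):
    """Simplest mark-resight estimator: N = m + u / p."""
    if not 0.0 < p_hat <= 1.0:
        raise ValueError("p_hat must lie in (0, 1]")
    return n_marked + n_unmarked_seen / p_hat


# Hypothetical numbers: 20 marked animals, 30 unmarked animals counted,
# and a detection probability of 0.5 estimated from the marked sample.
n_hat = mark_resight_estimate(n_marked=20, n_unmarked_seen=30, p_hat=0.5)
```

Because the detection probability enters only through the marked sample, the estimate is undefined without marked animals, which is the limitation the spatial count model avoids.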

In some cases, such as in point counts of birds, it may not be practical 
to mark individuals. An alternative way to increase posterior precision is to
utilize prior information on home range size. Indeed, extensive information 
on home range size has been compiled for many species in diverse habitats 
(e.g., DeGraaf and Yamasaki, 2001). It is easy to embody this information
in a prior distribution as we demonstrated for the parula data. 

An additional design extension that could increase precision is to use 
multiple sampling methods, in which one method generates encounter fre- 
quencies and the other method generates individuality. For example, camera 
traps are now commonly used with surveys for sign (scat or tracks), or hair 
snares for sampling bear populations. These distinct methods would have 
different basal detection rates but share an underlying spatial model de- 
scribing the organization of individuals in space. Our models show promise 
for using these disparate data types efficiently for estimating density. 

6.1. N-mixture models. Parallel developments which appear ostensibly
orthogonal to SCR models have addressed the problem of estimating pop-
ulation size when individuals are unmarked. So-called $N$-mixture models
(Royle, 2004b, a; Royle, Dawson and Bates, 2004) can be applied to a repeated-
measures type of data structure wherein data are collected at $R$ sites, with
$J$ replicate surveys conducted at each. $N$-mixture models regard abun-
dance at each site ($N_r$) as an i.i.d. realization of a discrete distribution such
as the Poisson or negative binomial with expectation $\theta$. In the standard
binomial $N$-mixture model, the observed counts are treated as binomial
outcomes with $N_r$ "trials" and detection probability $p$.
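
For reference, the standard binomial $N$-mixture model with a Poisson abundance distribution can be written compactly as below. The symbol $y_{rj}$ for the count at site $r$ on survey $j$ is introduced here for illustration and does not appear in the text above.

```latex
\begin{align*}
N_r &\sim \mathrm{Poisson}(\theta), & r &= 1, \ldots, R, \\
y_{rj} \mid N_r &\sim \mathrm{Binomial}(N_r, p), & j &= 1, \ldots, J, \\
L(\theta, p)
  &= \prod_{r=1}^{R} \sum_{N = \max_j y_{rj}}^{\infty}
     \Bigg[ \prod_{j=1}^{J} \binom{N}{y_{rj}} p^{y_{rj}} (1 - p)^{N - y_{rj}} \Bigg]
     \frac{e^{-\theta} \theta^{N}}{N!} .
\end{align*}
```

The product over sites in the likelihood is where the i.i.d. assumption enters, and it is this factorization that breaks down when neighboring sites share individuals.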

Although these models have proven useful for studies of factors that affect 
variation in abundance, interpretation of model parameters is strongly de- 
pendent on the assumption that populations are closed with respect to demo- 
graphic processes and movement. The closure assumption can be an impor- 
tant practical limitation (but see Dail and Madsen, 2011; Chandler, Royle and King, 
2011). Furthermore, the i.i.d. assumption is violated if spatial correlation
exists among sites, such as if animals move among plots. Although we
formulated the model developed in our paper as an extension of spatially
explicit capture-recapture models, it clearly can also be viewed as a
spatially explicit extension of N-mixture models where the local population
sizes $N_r$ are dependent owing to the nature of the sampling design.

Thus, two recently developed methodological frameworks, spatial
capture-recapture and N-mixture models, address different problems that
arise in sampling animal populations. SCR models address non-closure by
accommodating information on the spatial organization of individuals and
the juxtaposition of individuals with traps, and N-mixture models address
the inability to uniquely identify individuals. Our model unifies these two
modeling frameworks by addressing both issues simultaneously.

6.2. Alternative Observation Models. Several aspects of our "spatial
N-mixture model" can be modified to accommodate alternative sampling
designs or parametric distributions. We considered situations where an
individual can be detected more than once at a trap during a single occasion,
but under some designs this is not possible. When collecting DNA samples,
for instance, an individual can often be detected at most once during an
occasion, because multiple samples of biological material cannot be attributed
to distinct episodes. Therefore, rather than
$z_{irt} \sim \mathrm{Poisson}(\lambda_{ir})$ we have
$z_{irt} \sim \mathrm{Bernoulli}(p_{ir})$ where, for example,
$p_{ir} = p_0 \exp(-d_{ir}^2/(2\sigma^2))$, and $p_0$
is the probability of detecting an individual whose home range is centered
on trap r. This Bernoulli model is a focus of ongoing investigations.
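
The two encounter models differ only in how the half-normal distance
function is scaled, which the following sketch makes explicit. The function
names, trap coordinates, and parameter values are hypothetical and chosen
only for illustration.

```python
import math


def half_normal(distance, sigma):
    """Half-normal kernel exp(-d^2 / (2 sigma^2))."""
    return math.exp(-distance ** 2 / (2.0 * sigma ** 2))


def poisson_encounter_rate(distance, lam0, sigma):
    """Expected number of encounters per occasion (Poisson model)."""
    return lam0 * half_normal(distance, sigma)


def bernoulli_detection_prob(distance, p0, sigma):
    """Probability of at most one detection per occasion (Bernoulli model)."""
    return p0 * half_normal(distance, sigma)


# Hypothetical activity center and trap locations (arbitrary distance units)
activity_center = (0.0, 0.0)
traps = [(0.0, 0.0), (1.0, 0.0), (2.0, 2.0)]

distances = [math.dist(activity_center, trap) for trap in traps]
rates = [poisson_encounter_rate(d, lam0=0.5, sigma=1.0) for d in distances]
probs = [bernoulli_detection_prob(d, p0=0.3, sigma=1.0) for d in distances]
```

In the Poisson case lam0 can exceed 1 because it is a rate, whereas p0 must
lie in [0, 1] because it is a probability.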

Both the Poisson and the Bernoulli models produce count observations 
when aggregated over individuals to form trap-specific totals; however, ecol- 
ogists often collect so-called "detection/non-detection" data because it can 
be easier to determine if "at least one" individual was present rather than 
enumerating all individuals in a location. In this case, the underlying
$z_{irt}$ array is the same as in the above cases, but we observe
$y_{rt} = I(\sum_{i=1}^{N} z_{irt} > 0)$,
where $I$ is the indicator function. This "Poisson-binary model" is a spatially
explicit extension of the model of Royle and Nichols (2003) in which the un- 
derlying abundance state is inferred from binary data. We have investigated 
this model to a limited extent but do not report on those results here. 
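
One consequence of the Poisson encounter model can be written out
explicitly, assuming the encounter frequencies are independent across
individuals conditional on the activity centers. Because a sum of
independent Poisson variables is itself Poisson, the probability of a
detection at trap r on occasion t follows directly. The symbol $\Lambda_r$
is shorthand introduced here for the total encounter rate at trap r.

```latex
\begin{align*}
n_{rt} = \sum_{i=1}^{N} z_{irt} \mid s_1,\dots,s_N
  &\sim \mathrm{Poisson}(\Lambda_r),
  \qquad \Lambda_r = \sum_{i=1}^{N} \lambda_{ir}, \\
\Pr(y_{rt} = 1 \mid s_1,\dots,s_N)
  &= \Pr(n_{rt} > 0) = 1 - \exp(-\Lambda_r).
\end{align*}
```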

6.3. Spatial point process models. Our model has some direct linkages 
to existing point process models. We note that the observation intensity 
function (i.e., corresponding to the observation locations) is a compound 
Gaussian kernel similar to that of the Thomas process (Thomas, 1949; 
Møller and Waagepetersen, 2003, pp. 61-62). Also, the Poisson-Gamma
Convolution models (Wolpert and Ickstadt, 1998) are structurally similar (see
also Higdon (1998) and Best, Ickstadt and Wolpert (2000)). In particular,
our model is such a model but with a constant basal encounter rate
$\lambda_0$ and unknown number and location of "support points", which in
our case are the animal activity centers, $s_i$. We can thus regard our
model as a model for
estimating the location and local density of support points in such models, 
which we believe could be useful in the application of convolution
models. Best, Ickstadt and Wolpert (2000) devise an MCMC algorithm for the
Poisson-Gamma model based on data augmentation, which is similar to the
component of our algorithm for updating the z variables in the
conditional-on-z formulation of the model. We emphasize that our model is distinct
from these Poisson-Gamma models in that the number and location of such 
support points are estimated. 

If individuals were perfectly observable then the resulting point process of 
locations is clearly a standard Poisson or Binomial (fixed N) cluster process
or Neyman-Scott process. If detection is uniform over space but imperfect, 
then the basic process is unaffected by this random thinning. Our model can 
therefore be viewed formally as a Poisson (or Binomial) cluster process model 
but one in which the thinning is non-uniform, governed by the encounter 
model which dictates that thinning rate increases with distance from the 
observation points. In addition, our inference objective is, essentially, to 
estimate the number of parents in the underlying Poisson cluster process, 
where the observations are biased by an incomplete sampling apparatus 
(points in space). 

As a model of a thinned point process, our model has much in common 
with classical distance sampling models (Buckland et al., 2001). The main 
distinction is that our data structure does not include observed distances, 
although the underlying observation model is fundamentally the same as 
in distance sampling if there is only a single replicate sample and $s_i$ is
defined as an individual's location at an instant in time. For replicate samples,
our model preserves (latent) individuality across samples and traps which 
is not a feature of distance sampling. We note that error in measurement of 
distance is not a relevant consideration in our model, and we explicitly do 
not require the standard distance sampling assumption that the probability 
of detection is 1 if an individual occurs at the survey point. More impor- 
tantly, distance sampling models cannot be applied to data from many of 
the sampling designs for which our model is relevant. For example, many 
rare and endangered species can only be effectively surveyed using methods 
such as hair snares and camera traps that do not produce distance data 
(O'Connell, Nichols and Karanth, 2010). 

7. Conclusion. Concerns about "statistical independence" have prompted 
ecologists to design count-based studies such that observed random variables 
can be regarded as i.i.d. outcomes (Hurlbert, 1984). Interestingly, this of- 
ten proves impossible in practice, and elaborate methods have been devised 
to model spatial dependence as a nuisance parameter. Our paper presents a 
modeling framework that directly confronts this view by demonstrating that
spatial correlation carries information about the locations of individuals, 
which can be used to estimate density even when individuals are unmarked 
and distance-related heterogeneity exists in encounter probability. 

SUPPLEMENTARY MATERIAL

Supplement A: R code and parula dataset 

(http://lib.stat.cmu.edu/aoas/???/???). Includes MCMC algorithms, a data
simulator, the northern parula dataset, and a description of the method used
to obtain the informative prior used in the analysis of the parula data.